RSS-TOBI - A Prosodically Enhanced Romanian Speech Corpus

Authors

  • Tiberiu Boros
  • Adriana Stan
  • Oliver Watts
  • Stefan Daniel Dumitrescu
Abstract

This paper introduces a recent extension of a Romanian speech corpus that adds prosodic annotations of the speech data in the form of ToBI labels. We describe the methodology for determining the required pitch patterns that are common in Romanian, annotate the speech resource, and then compare two text-to-speech synthesis systems to establish the benefits of adding this type of information to our speech resource. The result is a publicly available speech dataset that can be used to further develop speech synthesis systems or to automatically learn to predict ToBI labels from Romanian text.

Similar resources

The Romanian speech synthesis (RSS) corpus: Building a high quality HMM-based speech synthesis system using a high sampling rate

This paper first introduces a newly-recorded high quality Romanian speech corpus designed for speech synthesis, called “RSS”, along with Romanian front-end text processing modules and HMM-based synthetic voices built from the corpus. All of these are now freely available for academic use in order to promote Romanian speech technology research. The RSS corpus comprises 3500 training sentences an...

A prosodically labeled database of spontaneous speech

This paper describes a prosodically labeled database of conversational speech, representing a subset of the Switchboard and Callhome corpora. The prosodic transcription system is a simplification of the ToBI system aimed at phenomena that would be most useful for automatic transcription and linguistic analysis of conversational speech. The transcription method and a distributional analysis of t...

Automatic labelling of German prosody

One limitation in prosody research is the lack of sufficient prosodically labelled speech data. In this paper, we present research on an automatic labelling system that is able to produce a phonological tonal labelling according to the ToBI-like intonation model for German developed by Féry. The system is not totally dependent on the specific language and/or labelling system, as it uses corpus ...

GREEK ToBI: A System for the Annotation of Greek Speech Corpora

Greek ToBI is a system for the annotation of (Standard) Greek spoken corpora that encodes intonational, prosodic and phonetic information. It is used to develop a large and publicly available database of prosodically annotated utterances for research, engineering and educational purposes. Greek ToBI is based on the system developed for American English (ToBI), but includes novel features (“tie...

A Prosodic Diphone Database for Korean Text-to-Speech Synthesis System

This paper presents a prosodically conditioned diphone database to be used in a Korean text-to-speech (TTS) synthesis system. The diphones are prosodically conditioned in the sense that a single conventional diphone is stored as different versions taken directly from the different prosodic domains of the prosodically labeled, read sentences (following the K-ToBI prosodic labeling conventions [3...

Publication date: 2014